An Effective Approach for Compression of Bengali Text
Abstract
In this paper, we propose an effective and efficient approach for compressing Bengali text, grounded in a methodical study of Bengali text compression techniques. The main goal of this research is to provide a framework for Bengali text compression that is simple, computationally inexpensive, and effective. The proposed scheme is intended to support low-overhead data communication and management on battery-powered, energy-constrained devices. We also review current data compression approaches and their applicability, usage, and efficiency for compressing Bengali text, and we compare the existing techniques with the proposed approach in terms of time complexity, space complexity, and compression ratio. Because no specific tertiary-dictionary-based Bengali text compression scheme is available for research, we also present an effective scheme for constructing the training platform, or knowledge base, used to obtain compression. The main feature of the proposed scheme is the integration of a string ranking scheme that indexes the source text to achieve hierarchical compression. Static coding is also employed to encode the data. Finally, the paper presents a power consumption analysis of the proposed scheme alongside its performance analysis in terms of compression ratio.
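The abstract does not give the details of the string ranking or static coding steps, so the following is only a minimal illustrative sketch of the general idea, not the authors' scheme. It assumes whitespace-separated tokens, a frequency-based ranking built from a training text, and a fixed one-byte code per ranked token with an escape byte for unknown tokens. All names (`build_ranked_dictionary`, `compress`, `decompress`, `ESCAPE`) are hypothetical.

```python
from collections import Counter

ESCAPE = 0xFF  # assumed marker byte for tokens missing from the dictionary


def build_ranked_dictionary(training_text, max_entries=255):
    """Rank tokens by descending frequency (ties broken by code point order)."""
    counts = Counter(training_text.split())
    ranked = sorted(counts, key=lambda tok: (-counts[tok], tok))
    # At most 255 entries, so indices 0..254 never collide with ESCAPE.
    return ranked[:max_entries]


def compress(text, dictionary):
    """Replace each known token with its one-byte rank; escape the rest."""
    index = {tok: i for i, tok in enumerate(dictionary)}
    out = bytearray()
    for tok in text.split():
        if tok in index:
            out.append(index[tok])
        else:
            raw = tok.encode("utf-8")
            if len(raw) > 255:
                raise ValueError("token too long for one-byte length field")
            out.append(ESCAPE)
            out.append(len(raw))
            out.extend(raw)
    return bytes(out)


def decompress(data, dictionary):
    """Invert compress(); tokens are rejoined with single spaces."""
    tokens = []
    i = 0
    while i < len(data):
        code = data[i]
        if code == ESCAPE:
            length = data[i + 1]
            tokens.append(data[i + 2:i + 2 + length].decode("utf-8"))
            i += 2 + length
        else:
            tokens.append(dictionary[code])
            i += 1
    return " ".join(tokens)


# Toy training text; a real knowledge base would be built from a large corpus.
dictionary = build_ranked_dictionary("আমি ভাত খাই আমি বই আমি")
packed = compress("আমি বই", dictionary)
restored = decompress(packed, dictionary)
```

Because the code table is fixed in advance and shared by both sides, no per-message table has to be transmitted, which is the kind of low overhead the abstract targets for constrained devices. Note that this sketch normalizes runs of whitespace to single spaces.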
Similar resources
Design and Analysis of an Effective Corpus for Evaluation of Bengali Text Compression Schemes
In this paper, we propose an effective platform for evaluation of Bengali text compression schemes. A novel scheme for construction of Bengali text compression corpus has also been incorporated in this paper. A methodical study on the formulation-approaches of text corpus for data compression and present an effective corpus named Ekushe-Khul for evaluating the Bengali text compression schemes h...
Performance Improvement Of Bengali Text Compression Using Transliteration And Huffman Principle
In this paper, we propose a new compression technique based on transliteration of Bengali text to English. Compared to Bengali, English is a less symbolic language. Thus transliteration of Bengali text to English reduces the number of characters to be coded. Huffman coding is well known for producing optimal compression. When the Huffman principle is applied to transliterated text significant perfo...
An Enhanced Static Data Compression Scheme Of Bengali Short Message
This paper concerns a modified approach to compressing short Bengali text messages for small devices. The prime objective of this research is to establish a low-complexity compression scheme suitable for small devices having small memory and relatively lower processing speed. The basic aim is not to compress text of any size up to its maximum level without having any constraint on space...
Metrics for Bengali Text Entry Research
With the intention of bringing uniformity to Bengali text entry research, we present a new approach for calculating the most popular English text entry evaluation metrics for Bengali. To demonstrate our approach, we conducted a user study in which we evaluated four popular Bengali text entry techniques.
Bengali and Hindi to English Cross-language Text Retrieval under Limited Resources
This paper describes our experiment on two cross-lingual and one monolingual English text retrievals at CLEF in the ad-hoc track. The cross-language task includes the retrieval of English documents in response to queries in two of the most widely spoken Indian languages, Hindi and Bengali. For our experiment, we had access to a Hindi-English bilingual lexicon, ’Shabdanjali’, consisting of approx. 26K H...